Speaker-adapted training on the Switchboard Corpus

Authors

  • John W. McDonough
  • Tasos Anastasakos
  • George Zavaliagkos
  • Herbert Gish

Abstract

Speaker adaptation is the process of transforming a speaker-independent acoustic model so that it more closely matches the characteristics of a particular speaker. Several researchers have shown it to be an effective means of improving the performance of large vocabulary continuous speech recognition systems. Until very recently, speaker adaptation has been used exclusively as a part of the recognition process. This is undesirable inasmuch as it leads to a mismatched condition between test and training, and hence sub-optimal recognition performance. Very recently, there has been a growing interest in applying speaker-adaptation techniques to HMM training in order to alleviate the training/test mismatch. In prior work, we presented an iterative scheme for determining the maximum likelihood solution for the set of speaker-independent means and variances when speaker-dependent adaptation is performed during HMM training. In the present work, we investigate specific issues encountered in applying this general framework to the task of improving recognition performance on the Switchboard Corpus.
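
The alternating structure of such an iterative scheme can be illustrated with a toy sketch. This is not the authors' method: it assumes a bias-only speaker transform (each speaker shifts every Gaussian mean by one offset vector), diagonal covariances, and a frame-to-Gaussian alignment that is already known, whereas the abstract concerns HMM training. The function names `sat_bias_only` and `log_likelihood` are hypothetical.

```python
import numpy as np


def log_likelihood(frames, speakers, labels, means, variances, bias):
    """Total log-likelihood under N(means[k] + bias[s], diag(variances[k]))."""
    diff = frames - means[labels] - bias[speakers]
    var = variances[labels]
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var))


def sat_bias_only(frames, speakers, labels, n_iter=10):
    """Toy speaker-adapted training with a per-speaker bias (an assumption).

    frames:   (T, D) feature vectors
    speakers: (T,) integer speaker index of each frame
    labels:   (T,) integer Gaussian index of each frame (assumed known)
    Returns means, variances, bias and the log-likelihood after each pass.
    """
    n_spk = int(speakers.max()) + 1
    n_gauss = int(labels.max()) + 1
    dim = frames.shape[1]
    bias = np.zeros((n_spk, dim))          # zero bias = speaker-independent start
    means = np.zeros((n_gauss, dim))
    variances = np.ones((n_gauss, dim))
    history = []
    for _ in range(n_iter):
        # Step 1: with the speaker biases fixed, re-estimate the
        # speaker-independent means and variances from bias-removed frames.
        normalized = frames - bias[speakers]
        for k in range(n_gauss):
            sel = labels == k
            means[k] = normalized[sel].mean(axis=0)
            variances[k] = np.maximum(normalized[sel].var(axis=0), 1e-8)
        history.append(
            log_likelihood(frames, speakers, labels, means, variances, bias)
        )
        # Step 2: with means and variances fixed, re-estimate each speaker's
        # bias as the precision-weighted average residual.
        resid = frames - means[labels]
        prec = 1.0 / variances[labels]
        for s in range(n_spk):
            sel = speakers == s
            bias[s] = (prec[sel] * resid[sel]).sum(axis=0) / prec[sel].sum(axis=0)
    return means, variances, bias, history
```

The first entry of `history` corresponds to an ordinary speaker-independent estimate (all biases zero). Each later entry follows one bias update and one mean/variance update, so in this toy setting the log-likelihood should not decrease from one pass to the next.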

Similar resources

Single-pass adapted training with all-pass transforms

In recent work, the all-pass transform (APT) was proposed as the basis of a speaker adaptation scheme intended for use with a large vocabulary speech recognition system. It was shown that APT-based adaptation reduces to a linear transformation of cepstral means, much like the better known maximum likelihood linear regression (MLLR), but is specified by far fewer free parameters. Due to its line...

Speaker recognition with the Switchboard corpus

In this paper we present our development work carried out in preparation for the March’96 speaker recognition test on the Switchboard corpus organized by NIST. The speaker verification system evaluated was a Gaussian mixture model. We provide experimental results on the development test and evaluation test data, and some experiments carried out since the evaluation comparing the GMM with a phon...

Kernel Alignment Maximization for Speaker Recognition Based on High-Level Features

In this paper text-independent automatic speaker verification based on support vector machines is considered. A generalized linear kernel training method based on kernel alignment maximization is proposed. First, kernel matrix decomposition into a sum of maximally aligned directions in the input space is performed and this decomposition is spectrally optimized. The method was evaluated for high...

Recent improvements in voicemail transcription

In this paper we report recent improvements in voicemail transcription. The voicemail transcription task was introduced last year [1] as representing a style of conversational telephone speech that is somewhat different from the Switchboard and CallHome [2] databases. Last year, the speaker independent and speaker adapted word error rates (WER) on this task were reported at 41.94% and 38.18% re...

Improving English Conversational Telephone Speech Recognition

The goal of this work is to build a state-of-the-art English conversational telephone speech recognition system. We investigated several techniques to improve acoustic modeling, namely speaker-dependent bottleneck features, deep Bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks, data augmentation and score fusion of DNN and BLSTM models. Training set consisted of the 300 ho...

Publication date: 1997